The Sum-Product Theorem: A Foundation for Learning Tractable Models
Authors
Abstract
Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.
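The abstract's central condition — every product's factors have disjoint scopes — can be illustrated with a small sketch. The names below (`Semiring`, `Leaf`, `Sum`, `Product`, `evaluate`, `sum_over_all`) are ours, not the paper's, and for simplicity the sketch additionally assumes each sum node's children share the same scope, so summing a leaf over its variable's domain and then evaluating the rest of the network bottom-up yields the semiring sum over all assignments in a single pass:

```python
# Minimal sketch (not the paper's implementation) of evaluating and
# tractably summing a decomposable sum-product function over a semiring.
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Dict, List, Union

@dataclass
class Semiring:
    add: Callable   # the semiring sum (oplus)
    mul: Callable   # the semiring product (otimes)
    zero: object    # identity of add
    one: object     # identity of mul

@dataclass
class Leaf:
    var: str
    phi: Callable   # unary leaf function phi_j(X_i)

@dataclass
class Sum:
    children: List["Node"]  # assumed here to share the same scope

@dataclass
class Product:
    children: List["Node"]  # decomposability: disjoint scopes

Node = Union[Leaf, Sum, Product]

def evaluate(node: Node, x: Dict[str, object], sr: Semiring):
    """Evaluate the SPF at one assignment x, bottom-up (linear in size)."""
    if isinstance(node, Leaf):
        return node.phi(x[node.var])
    vals = [evaluate(c, x, sr) for c in node.children]
    op, unit = (sr.add, sr.zero) if isinstance(node, Sum) else (sr.mul, sr.one)
    return reduce(op, vals, unit)

def sum_over_all(node: Node, domains: Dict[str, list], sr: Semiring):
    """Semiring-sum the SPF over all assignments in one bottom-up pass.
    Disjoint scopes let the outer sum distribute through every product,
    so each leaf is simply pre-summed over its own variable's domain."""
    if isinstance(node, Leaf):
        return reduce(sr.add, (node.phi(v) for v in domains[node.var]), sr.zero)
    vals = [sum_over_all(c, domains, sr) for c in node.children]
    op, unit = (sr.add, sr.zero) if isinstance(node, Sum) else (sr.mul, sr.one)
    return reduce(op, vals, unit)
```

Swapping the semiring changes what the same pass computes: with (+, x) it is the exhaustive sum (e.g. a partition function), while with (max, x) over nonnegative values it is the global optimum, which is the sense in which a learned nonconvex function can be globally optimized in polynomial time.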
Similar resources
The Sum-Product Theorem: A Foundation for Learning Tractable Models (Supplementary Material)
Let S(X) be a decomposable SPF of size |S| on a commutative semiring (R, ⊕, ⊗, 0, 1), let d = |X_i| for all X_i ∈ X where X = (X_1, ..., X_n), and let the cost of a ⊕ b and a ⊗ b for any elements a, b ∈ R be c. Further, let e denote the complexity of evaluating any unary leaf function φ_j(X_i) in S, and let k = max_{v ∈ S_sum, j ∈ Ch(v)} |X_v \ X_j| < n, where S_sum, S_prod, and S_leaf are the sum, product, and leaf...
Learning Relational Sum-Product Networks
Sum-product networks (SPNs) are a recently-proposed deep architecture that guarantees tractable inference, even on certain high-treewidth models. SPNs are a propositional architecture, treating the instances as independent and identically distributed. In this paper, we introduce Relational Sum-Product Networks (RSPNs), a new tractable first-order probabilistic architecture. RSPNs generalize SPNs...
Directly Learning Tractable Models for Sequential Inference and Decision-Making
Probabilistic graphical models such as Bayesian networks and Markov networks provide a general framework to represent multivariate distributions while exploiting conditional independence. Over the years, many approaches have been proposed to learn the structure of those networks (Heckerman et al., 1995; Neapolitan, 2004). However, even if the resulting network is small, inference may be intract...
Learning the Structure of Sum-Product Networks via an SVD-based Algorithm
Sum-product networks (SPNs) are a recently developed class of deep probabilistic models where inference is tractable. We present two new structure learning algorithms for sum-product networks, in the generative and discriminative settings, that are based on recursively extracting rank-one submatrices from data. The proposed algorithms find the sub-SPNs that are the most coherent jointly in the i...
Learning Sum-Product Networks with Direct and Indirect Variable Interactions
Sum-product networks (SPNs) are a deep probabilistic representation that allows for efficient, exact inference. SPNs generalize many other tractable models, including thin junction trees, latent tree models, and many types of mixtures. Previous work on learning SPN structure has mainly focused on using top-down or bottom-up clustering to find mixtures, which capture variable interactions indire...
Publication date: 2016